Read PDF Text
AutomatR.DefaultActivities.PDF.ReadPDFText
The "Read PDF Text" activity in AutomatR extracts text content from a specified PDF file. It uses the iText library to parse the PDF and retrieve its text, which supports automation workflows that process PDF documents.
Properties
Name | Description |
---|---|
Input | |
File Path | Specifies the path of the PDF file from which to extract text. Accepts a String variable containing the full path to the PDF file. |
Pages | Specifies the pages from which to extract text. Enter individual page numbers separated by commas, or "all" to extract text from every page. Accepts a String variable containing the page selection. Example: "1,3,5" or "all". |
Password | Specifies the password for the PDF file. Required only if the PDF is password-protected. Accepts an Object variable containing the password. |
Misc | |
Display Name | Sets a custom name for the activity as displayed in the workflow. A descriptive name makes the automation project easier to read and organize. Accepts a String variable containing the desired display name. |
Optional | |
Delay | Specifies how long (in seconds) to wait before executing the "Read PDF Text" activity. A delay can help with synchronization issues. Accepts an Integer variable containing the delay duration. Example: to wait 5 seconds, enter 5. |
Output | |
Result | Outputs the text read from the specified pages of the PDF file. Returns a String variable containing the extracted text. |
How to use:
- Drag and drop the "Read PDF Text" activity onto the workflow.
- Configure the properties by specifying the file path, password (if applicable), pages, and optional delay.
- Customize the display name for clarity in the workflow.
- Execute the workflow to read text content from the specified PDF file.
Example:
Consider an example where the "Read PDF Text" activity is used to extract text from pages 1, 3, and 5 of a PDF file named "document.pdf":
Read PDF Text:
  Display Name: "Extract PDF Text"
  File Path: "C:\Documents\document.pdf"
  Password: "mypassword"
  Pages: "1,3,5"
  Result: pdfTextContent
In this example, the activity reads text content from pages 1, 3, and 5 of the specified PDF file, and the extracted text is stored in the variable "pdfTextContent" for further use in the workflow.
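The Pages value accepts either a comma-separated list of page numbers or "all". As a rough illustration of how such a value could be interpreted, here is a minimal Python sketch. It is not AutomatR's implementation: the function name `parse_pages`, the `page_count` parameter, 1-based page numbering, case-insensitive matching of "all", and the rejection of out-of-range pages are all assumptions made for this example.

```python
def parse_pages(spec: str, page_count: int) -> list[int]:
    """Turn a Pages value such as "1,3,5" or "all" into a list of page numbers.

    Hypothetical helper: assumes pages are numbered from 1 and that
    "all" is matched case-insensitively.
    """
    text = spec.strip().lower()
    if text == "all":
        # "all" selects every page in the document
        return list(range(1, page_count + 1))

    pages = []
    for part in text.split(","):
        part = part.strip()
        if not part.isdecimal():
            raise ValueError(f"invalid page entry: {part!r}")
        number = int(part)
        if not 1 <= number <= page_count:
            raise ValueError(f"page {number} is outside 1..{page_count}")
        pages.append(number)
    return pages
```

Validating a Pages string in this way before running the workflow can surface typos (for example a stray letter or a page number beyond the end of the document) earlier than a failed extraction would.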